ARL-TN-0764  •  June  2016 


US  Army  Research  Laboratory 


Mission  Driven  Scene  Understanding: 
Dynamic  Environments 

by  Arnold  Tunick 


NOTICES 

Disclaimers 

The findings in this report are not to be construed as an official Department of the
Army position unless so designated by other authorized documents.

Citation of manufacturer’s or trade names does not constitute an official
endorsement or approval of the use thereof.

Destroy  this  report  when  it  is  no  longer  needed.  Do  not  return  it  to  the  originator. 



by  Arnold  Tunick 

Computational  and  Information  Sciences  Directorate,  ARL 


Approved  for  public  release;  distribution  unlimited. 


REPORT  DOCUMENTATION  PAGE 


Form  Approved 
OMB No. 0704-0188


Public  reporting  burden  for  this  collection  of  information  is  estimated  to  average  1  hour  per  response,  including  the  time  for  reviewing  instructions,  searching  existing  data  sources,  gathering  and  maintaining  the 
data  needed,  and  completing  and  reviewing  the  collection  information.  Send  comments  regarding  this  burden  estimate  or  any  other  aspect  of  this  collection  of  information,  including  suggestions  for  reducing  the 
burden,  to  Department  of  Defense,  Washington  Headquarters  Services,  Directorate  for  Information  Operations  and  Reports  (0704-0188),  1215  Jefferson  Davis  Highway,  Suite  1204,  Arlington,  VA  22202-4302. 
Respondents  should  be  aware  that  notwithstanding  any  other  provision  of  law,  no  person  shall  be  subject  to  any  penalty  for  failing  to  comply  with  a  collection  of  information  if  it  does  not  display  a  currently  valid 
OMB control number.

PLEASE  DO  NOT  RETURN  YOUR  FORM  TO  THE  ABOVE  ADDRESS. 


1. REPORT DATE (DD-MM-YYYY): June 2016

2. REPORT TYPE: Technical Note

3. DATES COVERED (From - To): 10/2015-06/2016

4. TITLE AND SUBTITLE: Mission Driven Scene Understanding: Dynamic Environments

5a. CONTRACT NUMBER:

5b. GRANT NUMBER:

5c. PROGRAM ELEMENT NUMBER:

5d. PROJECT NUMBER:

5e. TASK NUMBER:

5f. WORK UNIT NUMBER:

6. AUTHOR(S): Arnold Tunick

7. PERFORMING ORGANIZATION NAME(S) AND ADDRESS(ES):

US Army Research Laboratory
ATTN: RDRL-CII-A
2800 Powder Mill Road
Adelphi, MD 20783-1138

8. PERFORMING ORGANIZATION REPORT NUMBER: ARL-TN-0764

9. SPONSORING/MONITORING AGENCY NAME(S) AND ADDRESS(ES):

10. SPONSOR/MONITOR'S ACRONYM(S):

11. SPONSOR/MONITOR'S REPORT NUMBER(S):

12. DISTRIBUTION/AVAILABILITY STATEMENT

Approved for public release; distribution unlimited.

13. SUPPLEMENTARY NOTES

14. ABSTRACT

Knowledge of how time and space changing environmental conditions cause changes in the context of images is necessary for
scene understanding. Such dynamic environmental conditions (e.g., changing illumination, precipitation, and vegetation) can
modify saliency and context, obscure features, and degrade object recognition. Here, context means more than the typically
referenced attributes, content, or composition of an outdoor scene. For Army applications, scene understanding needs to be
viewed in the context of providing optimal value to the Army mission. Then, for example, helpful image cues that relate to
mission activities may include time of day, current and future weather conditions, visibility, terrain, and scene location. In this
report, we outline progress toward implementing our mission driven scene understanding approach to advance the value of
Army autonomous intelligent systems. We describe the proof-of-principle installation, setup, and testing of a convolutional
neural network (CNN) program developed in Python and all its required software dependencies. While we found that the CNN
was able to determine the correct class labels for images taken from the training data set, the validation process did not appear
to provide optimal results for images not previously seen. Thus, we recommend performing additional trials and analysis to
better determine the feasibility of using the CNN to augment our approach.

15. SUBJECT TERMS

computer vision, context, saliency, visibility, illumination, convolutional neural network

16. SECURITY CLASSIFICATION OF: a. REPORT: Unclassified; b. ABSTRACT: Unclassified; c. THIS PAGE: Unclassified

17. LIMITATION OF ABSTRACT: UU

18. NUMBER OF PAGES: 26

19a. NAME OF RESPONSIBLE PERSON: Arnold Tunick

19b. TELEPHONE NUMBER (Include area code): (301) 394-1233

Standard Form 298 (Rev. 8/98)
Prescribed by ANSI Std. Z39.18

Contents

List of Figures

List of Tables

Acknowledgments

1. Introduction

2. Prerequisite Software Installation

2.1 GIT for Windows

2.2 Visual Studio Community 2013

2.3 Windows Software Development Kit for Windows 10

2.4 CUDA v7.5

2.5 TDM-GCC

2.6 Scientific Python v2.7.9.4

3. Installing Theano v0.8.0

3.1 Configuration of Paths

3.2 Test the Configuration of Paths

3.3 Link Library for GCC

3.4 Setup/Install Theano

3.5 Test Theano: CPU

3.6 Test Theano: GPU

3.7 Additional Theano Test

4. AlexNet CNN Implementation with Theano

4.1 PIP

4.2 Pycuda

4.3 Hickle

4.4 Pylearn2

4.5 Theano-Alexnet

4.6 Prepare and Preprocess ImageNet Data

4.6.1 Set Configurations Paths for AlexNet

4.6.2 Preprocessed ImageNet Data for Theano-AlexNet

4.7 Train Theano-AlexNet

5. Summary and Conclusions

References

Distribution List


List of Figures

Fig. 1 A few example images from a subset of the ImageNet ILSVRC2012
data used for the short trial training of the Theano-AlexNet code that
illustrate time- and space-varying environmental conditions in outdoor
scenes, such as variations in illumination, vegetation, terrain, and
visibility

List of Tables

Table 1 Folders generated in C:\SciSoft\Git\theano_alexnet\scratch\ilsvrc12\

Table 2 Files generated in C:\SciSoft\Git\theano_alexnet\scratch\ilsvrc12\labels

Table 3 Files generated in C:\SciSoft\Git\theano_alexnet\scratch\ilsvrc12\misc

Table 4 Building the model

Table 5 CNN training and validation results: 20,000 iterations


Acknowledgments 


I thank RE Meyers, P David, G Warnell, B Byrne, C Karan, and S Gutstein for
helpful  discussions.  This  research  was  supported  by  the  US  Army  Research 
Laboratory. 




1.  Introduction 


Rapid and robust scene understanding is a critically important goal for the
development of Army autonomous intelligent systems. For outdoor natural scenes,
autonomous intelligent systems will need to quickly discern the depth of view,
navigability, exposure or concealment (as it relates to object searching), and
transience, that is, the rate at which elements of the scene or its environment are
changing in space and time. In this regard, saliency estimation has been helpful
to computationally identify elements in a scene that immediately capture the visual
attention of an observer. Several recent papers have discussed concepts
associated with visual saliency to enhance automated navigation and scene
exploration. Note, however, that the most active or salient object(s) in a scene,
by this definition, may not represent the most important or meaningful feature(s)
of the scene in the context of the Army mission. In other words, visual saliency
also can be used to highlight key image cues that relate to Army mission activities.
For example, an automated vision system may readily detect changes in the ground
surface as a new or different object in the field of view; however, recognizing the
physical characteristics of the new surface (e.g., shallow or deep water, thick, thin,
or melting ice, freezing rain, snow, mud, quicksand, and so on) and observing any
changes in the context of the image may be critically important. Characterizing
interactions between objects and the environment also can contribute to physical
scene understanding.

Furthermore, knowledge of how time and space changing environmental conditions
cause changes in the context of images is necessary for scene understanding.
Here, context means more than the typically referenced attributes, content, or
composition of an outdoor scene. For Army applications, scene understanding
needs to be viewed in the context of providing optimal value to the Army mission.
Then, for example, helpful image cues that relate to mission activities may include
time of day, current and future weather conditions, visibility, terrain, and scene
location. For instance, changing weather elements on the battlefield can alter terrain
features and trafficability; low visibility can impede reconnaissance and target
acquisition or alternately conceal friendly forces' maneuvers and activities; and
wind speed and direction can favor upwind forces in nuclear, biological, and
chemical attacks or decrease the effectiveness of downwind forces due to blowing
dust, smoke, sand, rain, sleet, or snow. In fact, any image cue that can
potentially help the mission should not be overlooked, since it will aid scene
understanding in the context of the Army mission. Consequently, due to bandwidth
and/or operations constraints, there will be a need for metrics to prioritize image
cues that relate to mission activities. Thus, our mission driven scene understanding
approach is designed to optimize mission success.

Many of the current methods for scene understanding, like those that generate
image descriptions via automated semantic labeling or visual scene
classification, are only beginning to address changing environmental conditions
(e.g., with regard to identifying changes in terrain characteristics to enhance
autonomous navigation processes). Yet, considering context changes (e.g., due to
a changing environment) can pose serious challenges for computer vision
processes, such as those associated with place recognition, navigation, road/terrain
detection, and scene exploration. This is because rain, snow, and fog weather
events, as well as smoke, haze, or other changes in lighting and visibility can
modify saliency and context of an outdoor scene, obscure features, and significantly
degrade object recognition. Naturally, scene-depicted environmental
conditions can vary with time of day, season, and location.

In  this  report,  we  outline  progress  toward  implementing  our  mission  driven  scene 
understanding  approach  to  advance  the  value  of  Army  autonomous  intelligent 
systems  and  support  the  Army  mission  in  complex  and  changing  battlefield 
environments. We describe the proof-of-principle installation, setup, and testing of
a convolutional neural network (CNN) program developed in Python and all its
required software dependencies. Here, we suggest that the CNN could be tested
initially with simple single-object images and later on with more-complicated
scenes,  such  as  those  illustrating  changes  in  illumination,  vegetation,  terrain,  and 
visibility. 

2.  Prerequisite  Software  Installation 

In this section, we outline the prerequisite software installations to implement the
Theano program code on a Windows 10 notebook computer. Here, Theano is a
Python library that facilitates the efficient evaluation of mathematical expressions
involving multidimensional arrays. Alternately, an online overview for installing
Theano on Windows can be found at
https://deeplearning.net/software/theano/install_windows.html#install-windows.

2.1  GIT  for  Windows 

To access the GitHub software repository, download the 64-bit version of GIT from
https://github.com/git-for-windows/git/releases/tag/v2.7.1.windows.2 and extract
the files into the folder C:\SciSoft\Git.

2.2 Visual Studio Community 2013

To access a C++ integrated development environment with 64-bit compilers,
download Visual Studio Community 2013 from
https://www.visualstudio.com/en-us/news/vs2013-community-vs.aspx. Installation
and setup for this software is self-explanatory, although one does need to add the
following 3 folders to the path:

1. C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin\amd64

2. C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\lib\amd64

3. C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\include

2.3  Windows  Software  Development  Kit  for  Windows  10 

In addition to Visual Studio 12.0, download the Windows software development
kit for Windows 10 from https://dev.windows.com/en-us/downloads/windows-10-sdk
and extract the files into the folder C:\Program Files (x86)\Microsoft Visual
Studio 12.0\VSSDK. The VSSDK folder should also be added to the path.

2.4 CUDA v7.5

To provide a development environment for C++ programs implementing graphics
processing unit (GPU)-accelerated applications, download CUDA v7.5 from
https://developer.nvidia.com/cuda-toolkit. This software installation requires that a
supported version of Microsoft Visual Studio be found on the computer. If not
completed automatically, the path can be updated to include the following 2 folders:

1. C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\libnvvp

2. C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\bin
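As a quick check that the folders listed in Sections 2.2 through 2.4 are actually on the search path, a short Python sketch along the following lines may help. The helper name missing_from_path is hypothetical (it is not part of any package discussed in this report), and the folder list only repeats paths given above; it should be adjusted to the local installation.

```python
import os
import ntpath  # Windows path rules, importable on any platform


def missing_from_path(required, path_value=None):
    """Return the folders in `required` that are absent from a PATH-style string.

    The comparison ignores case and trailing separators, as Windows does.
    `path_value` defaults to the current PATH environment variable.
    (Hypothetical helper, for illustration only.)
    """
    if path_value is None:
        path_value = os.environ.get("PATH", "")

    def norm(folder):
        return ntpath.normcase(ntpath.normpath(folder.strip().strip('"')))

    # Windows separates PATH entries with semicolons
    present = {norm(p) for p in path_value.split(";") if p.strip()}
    return [folder for folder in required if norm(folder) not in present]


# Folders named in Sections 2.2 and 2.4 (extend as needed)
REQUIRED = [
    r"C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin\amd64",
    r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\bin",
    r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\libnvvp",
]

if __name__ == "__main__":
    for folder in missing_from_path(REQUIRED):
        print("not on PATH:", folder)
```

Any folder that is printed still needs to be added to the path before continuing.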

2.5  TDM-GCC 

The Theano code compiler requires TDM-GCC installation for either 32- or 64-bit
platforms. Therefore, one needs to download the 64-bit version of the TDM-GCC
software from http://tdm-gcc.tdragon.net/ and extract the files into the folder
C:\SciSoft\TDM-GCC-64.

2.6  Scientific  Python  v2.7.9.4 

To provide the necessary Python components for both Theano and the CNN
AlexNet and for all of their programs’ software dependencies, such as numpy,
hickle, pycuda, pylearn2, and zeromq, download and install the 64-bit version of
Python v2.7.9.4 (the WinPython distribution) from
https://sourceforge.net/projects/winpython/files/WinPython_2.7/2.7.9.4/ and extract
the files into the folder C:\SciSoft\WinPython-64bit-2.7.9.4.


3. Installing Theano v0.8.0


To provide the mathematical framework within which the CNN AlexNet compiles,
download the most current 64-bit version of Theano (v0.8.0) from
https://github.com/Theano/Theano and extract the files into the folder
C:\SciSoft\Git\theano. Alternately, one can download and install the Theano files
from a command window by typing the following at the prompt:

• C:\SciSoft\Git> git clone https://github.com/Theano/Theano.git

3.1  Configuration  of  Paths 


To configure the system path for Python and Visual Studio, save the following shell
script as C:\SciSoft\env.bat:

REM configuration of paths
set VSFORPYTHON="C:/Program Files (x86)/Microsoft Visual Studio 12.0"
set SCISOFT=%~dp0
REM add tdm gcc stuff
set PATH=%SCISOFT%TDM-GCC-64/bin;%SCISOFT%TDM-GCC-64/x86_64-w64-mingw32/bin;%PATH%
REM add winpython stuff
CALL %SCISOFT%WinPython-64bit-2.7.9.4/scripts/env.bat
REM configure path for msvc compilers
CALL %VSFORPYTHON%/vcvarsall.bat amd64
REM return a shell
cmd.exe /k

Note here that the file vcvarsall.bat, which is called within the env.bat shell script,
should contain the following path information:

:amd64
echo Setting environment for using Microsoft Visual Studio 2013 x64 tools.
set VCINSTALLDIR=%~dp0VC/
REM set WindowsSdkDir=%~dp0WinSDK/
set WindowsSdkDir=%~dp0VSSDK/
if not exist "%VCINSTALLDIR%bin/amd64/cl.exe" goto missing
set PATH=%VCINSTALLDIR%Bin/amd64;%WindowsSdkDir%VisualStudioIntegration/Tools/Bin;%PATH%
set INCLUDE=%VCINSTALLDIR%Include;%WindowsSdkDir%VisualStudioIntegration/Common/Inc;%INCLUDE%
set LIB=%VCINSTALLDIR%Lib/amd64;%WindowsSdkDir%VisualStudioIntegration/Common/Lib/x64;%LIB%
set LIBPATH=%VCINSTALLDIR%Lib/amd64;%WindowsSdkDir%VisualStudioIntegration\Common\Lib\x64;%LIBPATH%
goto :eof

3.2  Test  the  Configuration  of  Paths 

To test the path configuration, open the Python shell in a command window by
typing C:\SciSoft\env.bat and then verify that the following programs are found
by typing these lines at the prompt:

• C:\SciSoft> where gcc

• C:\SciSoft> where gendef

• C:\SciSoft> where cl

• C:\SciSoft> where nvcc
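The same lookup can be scripted. The sketch below is only an illustration: the function name locate_tools is hypothetical, and shutil.which requires Python 3.3 or later, so it would not run under the Python 2.7 interpreter installed in Section 2.6 (there, distutils.spawn.find_executable plays a similar role).

```python
import shutil


def locate_tools(names):
    """Map each executable name to its full path, or None when it is not found.

    Hypothetical helper; it mirrors what the `where` command reports.
    """
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    # gcc and gendef come from TDM-GCC, cl from Visual Studio, nvcc from CUDA
    for name, path in locate_tools(["gcc", "gendef", "cl", "nvcc"]).items():
        print("%-7s %s" % (name, path if path else "NOT FOUND"))
```

An entry reported as NOT FOUND points to a folder that is missing from the path set up in Section 3.1.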

3.3  Link  Library  for  GCC 

To create a link library for GCC, open the Python shell in a command window by
typing C:\SciSoft\env.bat and then type the following at the command window
prompt:

• C:\SciSoft> gendef WinPython-64bit-2.7.9.4\python-2.7.9.amd64\python27.dll

• C:\SciSoft> dlltool --dllname python27.dll --def python27.def --output-lib
WinPython-64bit-2.7.9.4\python-2.7.9.amd64\libs\libpython27.a

3.4 Setup/Install Theano

Finally, to set up and install Theano, open the Python shell in a command window
by typing C:\SciSoft\env.bat and then type the following at the prompt:

• C:\SciSoft\Git\Theano> python setup.py develop

3.5  Test  Theano:  CPU 


To test whether Theano works and is able to compile code for central processing
unit (CPU) execution, create the following test file (e.g., filename = test.py):

import numpy as np
import time
import theano

# two random matrices in Theano's configured float type
A = np.random.rand(1000, 10000).astype(theano.config.floatX)
B = np.random.rand(10000, 1000).astype(theano.config.floatX)

# time the matrix product with NumPy
np_start = time.time()
AB = A.dot(B)
np_end = time.time()

# compile the same product as a Theano function and time it
X, Y = theano.tensor.matrices('XY')
mf = theano.function([X, Y], X.dot(Y))
t_start = time.time()
tAB = mf(A, B)
t_end = time.time()

print("NP time: %f[s], theano time: %f[s]" % (np_end - np_start, t_end - t_start))

Then open the Python shell in a command window and type the following at the
prompt:

• C:\SciSoft\Git\Theano> python test.py

The following is the example result:

NP time: 1.480863[s], theano time: 1.475381[s]


3.6  Test  Theano:  GPU 

To test whether Theano works and is able to compile code for GPU execution,
create the file .theanorc.txt in C:\SciSoft\WinPython-64bit-2.7.9.4\settings as
follows:

[global]
device = gpu
REM device = cpu
floatX = float32
[nvcc]
flags=-LC:\SciSoft\WinPython-64bit-2.7.9.4\python-2.7.9.amd64\libs
compiler_bindir=C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin

Then, rerun the test.py file shown in Section 3.5.


3.7  Additional  Theano  Test 

As an additional test of the Theano code, open the Python shell in a command
window and type the following at the prompt:

• C:\SciSoft\Git\Theano> python C:\SciSoft\Git\Theano\bin\theano-nose --batch=3000

The following is the example result:

#################### 

#  COLLECTING  TESTS  # 

################################### 

#  RUNNING  TESTS  IN  BATCHES  OF  3000  # 
################################### 

100%  done  in  604.919s  (failed:  0) 

#################### 

#  ALL  TESTS  PASSED  # 

#################### 

4.  AlexNet  CNN  Implementation  with  Theano 

In this section, we outline all of the prerequisite software installations to implement
the AlexNet CNN program code [42] within Theano on a Windows 10 notebook
computer. Alternately, an online overview of configuring the paths for the AlexNet
CNN [42], preprocessing image data, and running the Python code can be found at
https://github.com/uoguelph-mlrg/theano_alexnet.

4.1  PIP 

An alternate way to install the Python site packages (e.g., pycuda) is to download
get-pip.py from https://pip.pypa.io/en/stable/installing/, which can be saved
into the folder C:\SciSoft\WinPython-64bit-2.7.9.4\python-2.7.9.amd64\Scripts.
Then to install PIP, open the Python shell C:\SciSoft\env.bat in a command
window and type the following:

• C:\SciSoft\WinPython-64bit-2.7.9.4\python-2.7.9.amd64\Scripts> python get-pip.py

4.2  Pycuda 

To install this dependent Python site package, download the file
“pycuda-2015.1.3+cuda7518-cp27-none-win_amd64.whl” from
http://www.lfd.uci.edu/~gohlke/pythonlibs/#pycuda and copy it to the folder
C:\SciSoft\WinPython-64bit-2.7.9.4\settings\pipwin\. Then to install pycuda, open
the Python shell in C:\SciSoft\env.bat and then at the command prompt type the
following:

• C:\SciSoft> pip install C:\SciSoft\WinPython-64bit-2.7.9.4\settings\pipwin\pycuda-2015.1.3+cuda7518-cp27-none-win_amd64.whl

It is necessary to install several required C++ libraries prior to completing the steps
for installing pycuda, as outlined above. Here, one needs to download
boost_1_59_0-msvc-12.0-64.exe from
https://sourceforge.net/projects/boost/files/boost-binaries/ and then double click
on the file to install boost in the folder C:\local\boost_1_59_0.

4.3  Hickle 

To install this dependent Python site package, download hickle from
https://github.com/telegraphic/hickle, open the Python shell in
C:\SciSoft\env.bat, and then type the following at the command window prompt:

• C:\SciSoft> cd C:\SciSoft\WinPython-64bit-2.7.9.4\python-2.7.9.amd64\Lib\site-packages\hickle

• C:\SciSoft\WinPython-64bit-2.7.9.4\python-2.7.9.amd64\Lib\site-packages\hickle> python setup.py install

4.4  Pylearn2 

To install this dependent Python site package, download pylearn2 from
https://github.com/lisa-lab/pylearn2, open the Python shell in C:\SciSoft\env.bat,
and then type the following at the command window prompt:

• C:\SciSoft> cd C:\SciSoft\WinPython-64bit-2.7.9.4\python-2.7.9.amd64\Lib\site-packages\pylearn2

• C:\SciSoft\WinPython-64bit-2.7.9.4\python-2.7.9.amd64\Lib\site-packages\pylearn2> python setup.py install
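Once the site packages in Sections 4.1 through 4.4 are installed, a short import check can confirm that Python sees them. The sketch below is an illustration only: missing_modules is a hypothetical helper, and the import name zmq for the zeromq binding is an assumption about how that package was installed.

```python
import importlib


def missing_modules(names):
    """Return the module names in `names` that cannot be imported.

    Hypothetical helper, for illustration only.
    """
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except Exception:
            # ImportError is the usual case; a broken install (for example a
            # missing DLL) can raise other errors, which also count as failures
            missing.append(name)
    return missing


if __name__ == "__main__":
    # "zmq" is assumed to be the import name of the zeromq binding
    wanted = ["numpy", "theano", "pycuda", "hickle", "pylearn2", "zmq"]
    for name in missing_modules(wanted):
        print("cannot import:", name)
```

Importing theano may take some time on the first run, since it compiles code when it loads.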

4.5  Theano-Alexnet 

Download Theano-Alexnet from https://github.com/uoguelph-mlrg/theano_alexnet
and extract the files into the folder C:\SciSoft\Git\theano_alexnet\.

4.6  Prepare  and  Preprocess  ImageNet  Data 

To prepare and preprocess ImageNet [43] data, register and download the ImageNet
Large Scale Visual Recognition Challenge 2012 (ILSVRC2012) image data .tar
files and the 2014 development kit from http://www.image-net.org into the
following 3 folders:

• C:\SciSoft\Git\theano_alexnet\mnt\data\datasets\ilsvrc_2014\ILSVRC2012_DET_train

• C:\SciSoft\Git\theano_alexnet\mnt\data\datasets\ilsvrc_2014\ILSVRC2012_DET_val

• C:\SciSoft\Git\theano_alexnet\mnt\data\datasets\ilsvrc_2014\ILSVRC2014_devkit

After downloading the image data, open the Python shell C:\SciSoft\env.bat and
in the command window run the script
C:\SciSoft\Git\theano_alexnet\preprocessing\generate_data.sh, which will call
3 Python scripts. This program runs for about 1-2 days. Alternately, for a short
trial of the AlexNet code, run the script
C:\SciSoft\Git\theano_alexnet\preprocessing\generate_toy_data.sh,
which takes about 10 min.

4.6.1  Set  Configurations  Paths  for  AlexNet 

Prior to preprocessing the image data, modify the path information in the file
C:\SciSoft\Git\theano_alexnet\preprocessing\path.yaml as follows and be sure to
make similar path annotations in the file
C:\SciSoft\Git\theano_alexnet\spec_1gpu.yaml:

# dir that contains folders like n01440764, n01443537, ...
train_img_dir: 'C:\SciSoft\Git\theano_alexnet\mnt\data\datasets\ilsvrc_2014\ILSVRC2012_DET_train\'

# dir that contains ILSVRC2012_val_00000001~50000.JPEG
val_img_dir: 'C:\SciSoft\Git\theano_alexnet\mnt\data\datasets\ilsvrc_2014\ILSVRC2012_DET_val\'

# dir to store all the preprocessed files
tar_root_dir: 'C:\SciSoft\Git\theano_alexnet\scratch\ilsvrc12'

# dir to store training batches
tar_train_dir: 'C:\SciSoft\Git\theano_alexnet\scratch\ilsvrc12\train_hkl'

# dir to store validation batches
tar_val_dir: 'C:\SciSoft\Git\theano_alexnet\scratch\ilsvrc12\val_hkl'

# dir to store img_mean.npy, shuffled_train_filenames.npy, train.txt, val.txt
misc_dir: 'C:\SciSoft\Git\theano_alexnet\scratch\ilsvrc12\misc'
meta_clsloc_mat: 'C:\SciSoft\Git\theano_alexnet\mnt\data\datasets\ilsvrc_2014\ILSVRC2014_devkit\data\meta_clsloc.mat'
val_label_file: 'C:\SciSoft\Git\theano_alexnet\mnt\data\datasets\ilsvrc_2014\ILSVRC2014_devkit\data\ILSVRC2014_clsloc_validation_ground_truth.txt'

# validation labels
valtxt_filename: 'C:\SciSoft\Git\theano_alexnet\scratch\ilsvrc12\misc\val.txt'

# training labels
traintxt_filename: 'C:\SciSoft\Git\theano_alexnet\scratch\ilsvrc12\misc\train.txt'
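Because a mistyped path only shows up once preprocessing is under way, it may be worth checking the input locations first. The following is a minimal sketch under stated assumptions: missing_inputs and INPUT_KEYS are hypothetical names, and the configuration is shown as a plain dictionary (in practice it could be read from the YAML file, for example with yaml.safe_load from PyYAML, if that package is available).

```python
import os


def missing_inputs(config, keys):
    """Return the entries of `keys` whose configured path does not exist on disk.

    Hypothetical helper; a key that is absent from `config` counts as missing.
    """
    return [k for k in keys if not os.path.exists(config.get(k, ""))]


# Inputs that must exist before preprocessing starts. The tar_* and misc
# entries are storage locations for results and are left out of the check.
INPUT_KEYS = ["train_img_dir", "val_img_dir", "meta_clsloc_mat", "val_label_file"]

if __name__ == "__main__":
    # Stand-in for the parsed configuration file; only two keys are shown
    config = {
        "train_img_dir": r"C:\SciSoft\Git\theano_alexnet\mnt\data\datasets\ilsvrc_2014\ILSVRC2012_DET_train",
        "val_img_dir": r"C:\SciSoft\Git\theano_alexnet\mnt\data\datasets\ilsvrc_2014\ILSVRC2012_DET_val",
    }
    for key in missing_inputs(config, INPUT_KEYS):
        print("missing or not configured:", key)
```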
In addition, in the file C:\SciSoft\Git\theano_alexnet\make_labels.py, add “import
os.path” at the top of the file and replace the line containing “filename =
filename.split(’/’)[1]” with “filename = os.path.basename(filename)”. Also replace
the line containing “key = train_filename.split(’/’)[-1]” with “key =
os.path.basename(train_filename)”. These corrections are necessary because the
Python .split delimiter “/” is not compatible with MS Windows path notations.
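The effect of this change can be seen in a few lines. The example below is a sketch with made-up file names; it uses the standard-library module ntpath, which is what os.path resolves to on Windows, so the behavior can be reproduced on any platform.

```python
import ntpath  # Windows path rules; os.path is this module on Windows

# Illustrative (made-up) file names in the two separator styles
windows_style = "n01440764\\n01440764_10026.JPEG"
posix_style = "n01440764/n01440764_10026.JPEG"

# Splitting on "/" leaves a backslash-separated path in one piece,
# so index [1] does not exist and the original line raises IndexError.
pieces = windows_style.split("/")
print(len(pieces))                      # 1

# basename under Windows rules accepts either separator.
print(ntpath.basename(windows_style))   # n01440764_10026.JPEG
print(ntpath.basename(posix_style))     # n01440764_10026.JPEG
```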

4.6.2  Preprocessed  ImageNet  Data  for  Theano-AlexNet 

The 7 folders generated by running the shorter (~10 min) script (i.e.,
generate_toy_data.sh) to preprocess a subset of the ImageNet data for Theano-
AlexNet are shown in Table 1.

Table 1 Folders generated in C:\SciSoft\Git\theano_alexnet\scratch\ilsvrc12\

Date        Time      Type   Name
02/24/2016  02:43 PM  <DIR>  labels
02/23/2016  02:59 PM  <DIR>  misc
02/23/2016  05:25 PM  <DIR>  models
02/23/2016  02:58 PM  <DIR>  train_hkl_b256_b_128
02/23/2016  02:58 PM  <DIR>  train_hkl_b256_b_256
02/23/2016  02:59 PM  <DIR>  val_hkl_b256_b_128
02/23/2016  02:59 PM  <DIR>  val_hkl_b256_b_256

In the folder C:\SciSoft\Git\theano_alexnet\scratch\ilsvrc12\labels, the following 6
files are generated (Table 2).

Table 2 Files generated in C:\SciSoft\Git\theano_alexnet\scratch\ilsvrc12\labels

Date        Time      Size (bytes)  Name
02/24/2016  02:43 PM     5,124,748  train_labels.npy
02/24/2016  02:43 PM     2,562,128  train_labels_0.npy
02/24/2016  02:43 PM     2,562,128  train_labels_1.npy
02/24/2016  02:39 PM       200,080  val_labels.npy
02/24/2016  02:43 PM        99,920  val_labels_0.npy
02/24/2016  02:43 PM        99,920  val_labels_1.npy

In the folder C:\SciSoft\Git\theano_alexnet\scratch\ilsvrc12\misc, the following 4
files are generated (Table 3).

Table 3 Files generated in C:\SciSoft\Git\theano_alexnet\scratch\ilsvrc12\misc

Date        Time      Size (bytes)  Name
02/24/2016  02:38 PM     1,572,960  img_mean.npy
02/24/2016  02:54 PM   142,209,617  shuffled_train_filenames.npy
02/24/2016  02:39 PM    45,110,600  train.txt
02/24/2016  02:39 PM     1,694,500  val.txt

In the folders train_hkl_b256_b_256 and val_hkl_b256_b_256, 10 files
(size = 50,333,799 each) are generated, which are labeled 0000.hkl through
0009.hkl. Note that each of these files contains 256 color images of size 256 × 256,
hence 2,560 image files for training and validation are used for the short trial
training of the Theano-AlexNet code. Figure 1 shows a few example images from
this subset of the ILSVRC2012 data set, which illustrate time- and space-varying
environmental conditions, such as variations in illumination, vegetation, terrain,
and visibility.
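The quoted file size can be cross-checked with a little arithmetic. The sketch below assumes 3 color channels stored as unsigned 8-bit values, which the report does not state explicitly; under that assumption the raw pixel data account for nearly all of each .hkl file, with the small remainder presumably being container overhead.

```python
IMAGES_PER_FILE = 256
HEIGHT = WIDTH = 256
CHANNELS = 3          # assumed: RGB color
BYTES_PER_VALUE = 1   # assumed: unsigned 8-bit storage

raw_bytes = IMAGES_PER_FILE * HEIGHT * WIDTH * CHANNELS * BYTES_PER_VALUE
reported_bytes = 50333799       # size of each .hkl file quoted above
overhead = reported_bytes - raw_bytes

files_per_folder = 10
total_images = files_per_folder * IMAGES_PER_FILE

print(raw_bytes)      # 50331648
print(overhead)       # 2151
print(total_images)   # 2560
```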


Fig. 1 A few example images from a subset of the ImageNet ILSVRC2012 data used for
the short trial training of the Theano-AlexNet code that illustrate time- and space-varying
environmental conditions in outdoor scenes, such as variations in illumination, vegetation,
terrain, and visibility

4.7  Train  Theano-AlexNet 

Theano-AlexNet was tested using the file C:\SciSoft\Git\theano_alexnet\train.py as
follows:

• C:\SciSoft\Git\theano_alexnet> python train.py THEANO_FLAGS=mode=FAST_RUN,floatX=float32

In our first trial, Theano-AlexNet initialized properly (Table 4) and then executed
20,000 iterations in about 66 h, whereupon the statement “Optimization complete”
was returned to the command window (Table 5). Model output files from the last
iteration were generated in the folder C:\SciSoft\Git\theano_alexnet\scratch\ilsvrc12,
which contained 11 weights files and 11 biases files, as well as 22 momentum files,
all of which define the computations of the neural network.

Table  4  Building  the  model 


conv (cudnn) layer with shape_in: (3, 227, 227, 256)
conv (cudnn) layer with shape_in: (96, 27, 27, 256)
conv (cudnn) layer with shape_in: (256, 13, 13, 256)
conv (cudnn) layer with shape_in: (384, 13, 13, 256)
conv (cudnn) layer with shape_in: (384, 13, 13, 256)
fc layer with num_in: 9216 num_out: 4096
dropout layer with P_drop: 0.5
fc layer with num_in: 4096 num_out: 4096
dropout layer with P_drop: 0.5
softmax layer with num_in: 4096 num_out: 1000
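The input shapes listed in Table 4 can be cross-checked with the usual output-size formula for convolution and pooling windows. The sketch below is illustrative only: the kernel sizes, strides, padding, and pooling positions are assumptions taken from the published AlexNet design, not values printed by the report, and the function names are hypothetical.

```python
def conv_out(size, kernel, stride=1, pad=0):
    """Spatial output size of a convolution or pooling window."""
    return (size + 2 * pad - kernel) // stride + 1


# (name, kernel, stride, pad, out_channels, pool_after)
# Assumed hyperparameters from the published AlexNet design.
LAYERS = [
    ("conv1", 11, 4, 0, 96, True),
    ("conv2", 5, 1, 2, 256, True),
    ("conv3", 3, 1, 1, 384, False),
    ("conv4", 3, 1, 1, 384, False),
    ("conv5", 3, 1, 1, 256, True),
]


def trace_shapes(size=227, channels=3):
    """Return each conv layer's (name, channels, size) input and the fc input width."""
    inputs = []
    for name, kernel, stride, pad, out_channels, pool in LAYERS:
        inputs.append((name, channels, size))
        size = conv_out(size, kernel, stride, pad)
        if pool:
            size = conv_out(size, 3, 2)  # assumed 3x3 max pooling, stride 2
        channels = out_channels
    return inputs, channels * size * size


inputs, fc_in = trace_shapes()
for name, channels, size in inputs:
    print(name, (channels, size, size))
print("fc input width:", fc_in)
```

With these assumed hyperparameters the traced channel counts and spatial sizes match the first three entries of each shape_in tuple in Table 4, and the flattened width matches the num_in of the first fully connected layer.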


Table  5  CNN  training  and  validation  results:  20,000  iterations 


training @ iter = 20
training cost: 6.901418685916
training error rate: 1.0
validation  loss:  6.907903 
validation  error:  99.921875  % 
training  @  iter  =  40 
training  cost:  6.89094781876 
training error rate: 1.0
validation  loss:  6.907701 
validation  error:  99.921875  % 
training  @  iter  =  60 
training  cost:  6.88030338287 
training  error  rate:  0.99609375 
validation  loss:  6.907765 
validation  error:  99.726562  % 
training  @  iter  =  80 
training  cost:  6.87519598 
training error rate: 1.0
validation  loss:  6.908174 
validation error: 99.882812 %
...
training @ iter = 19920
training  cost:  4.94013977051 
training  error  rate:  0.95703125 
validation  loss:  8.612438 
validation  error:  99.6875  % 
training  @  iter  =  19940 
training  cost:  5.020860672 
training  error  rate:  0.91015625 
validation  loss:  8.665205 
validation  error:  99.765625  % 
training  @  iter  =  19960 
training  cost:  4.81143093109 
training  error  rate:  0.9375 
validation  loss:  8.659591 
validation  error:  99.804688  % 
training  @  iter  =  19980 
training  cost:  4.9219660759 
training  error  rate:  0.93359375 
validation  loss:  8.645909 
validation  error:  99.84375  % 
Optimization complete
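The first entries in Table 5 can be compared with the score of a classifier that has no information about the image. The sketch below assumes that the reported cost is an average cross-entropy in natural-log units and that the 1,000 classes are equally likely, neither of which the report states explicitly. Under those assumptions a uniform guess gives a loss of ln(1000) and a top-1 error of 99.9%.

```python
import math

NUM_CLASSES = 1000  # softmax output width reported in Table 4


def chance_level(num_classes):
    """Cross-entropy loss and top-1 error (%) of a uniform random guess."""
    loss = math.log(num_classes)
    error_pct = 100.0 * (1.0 - 1.0 / num_classes)
    return loss, error_pct


loss, error_pct = chance_level(NUM_CLASSES)
print(round(loss, 6), round(error_pct, 3))
```

The initial validation loss of 6.9079 sits essentially at this baseline, and the validation errors in Table 5 stay within a few tenths of a percentage point of 99.9% throughout the run.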


We found that with greater numbers of iterations the training cost and training error
rates began to decrease. In fact, when we continued this model run, executing the
code from 20,000 to 60,000 iterations over an additional 138 h, we found that the
training cost at iteration = 60,000 was 0.0818 and the training error rate was 0.0273,
which means that the CNN had “learned” to assign the correct class label to an
image when the image is taken from the training data set. However, the validation
loss increased significantly (i.e., from 6.9079 to 26.7026) and the validation errors
after 60,000 iterations remained high (i.e., 99.6484%), which indicates that the
CNN is not assigning the correct class label to an image not previously seen,
possibly due to overfitting. For comparison, the validation error rates achieved by
Ding et al. after they ran the Theano-AlexNet model using 2 GPUs for 65 cycles
were 42.6% for the top-1 class label and 19.9% for the top-5 class label. Thus, we
recommend additional trials and analysis with increased numbers of training
images in order to achieve lower validation error rates using the CNN so that we
can better determine the feasibility of using the CNN to augment our approach,
initially with simple single-object images and later on with more complicated
scenes, such as those with time- and space-varying environmental conditions. As
an example, Theano-AlexNet could be trained on the larger ImageNet data set
containing  approximately  million  images,  as  described  previously,  even  though  this 
would require additional file storage (~500 GB) for the input and output files and
additional GPUs to run the program code with better computational
efficiency.
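The divergence between training cost and validation loss described above can be made explicit by tabulating the gap at the reported checkpoints. This is a sketch, not part of the Theano-AlexNet code: the function names are hypothetical, the values are rounded from Table 5 and the text, and it assumes the quoted validation loss of 26.7026 belongs to iteration 60,000.

```python
# (iteration, training cost, validation loss), rounded from Table 5 and the text.
CHECKPOINTS = [
    (20, 6.9014, 6.9079),
    (19980, 4.9220, 8.6459),
    (60000, 0.0818, 26.7026),  # assumption: loss quoted for iteration 60,000
]


def generalization_gaps(checkpoints):
    """Validation loss minus training cost at each checkpoint."""
    return [(it, val - train) for it, train, val in checkpoints]


def best_checkpoint(checkpoints):
    """Iteration with the lowest validation loss (a simple early-stopping rule)."""
    return min(checkpoints, key=lambda row: row[2])[0]


gaps = generalization_gaps(CHECKPOINTS)
best = best_checkpoint(CHECKPOINTS)

for iteration, gap in gaps:
    print(iteration, round(gap, 4))
print("lowest validation loss at iteration", best)
```

A gap that widens while the training cost falls is the usual signature of overfitting. On these three checkpoints an early-stopping rule keyed to validation loss would have halted almost immediately, which is consistent with the recommendation to train on more images.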

5.  Summary  and  Conclusions 

In  this  report,  we  outlined  progress  toward  implementing  our  mission  driven  scene 
understanding  approach  to  advance  the  value  of  Army  autonomous  intelligent 
systems.  We  described  the  proof-of-principle  installation,  setup,  and  testing  of  a 
convolutional  neural  network  (CNN)  program  developed  in  Python  and  all  of  its 
required  software  dependencies.  While  we  found  that  the  CNN  was  able  to 
determine  the  correct  class  labels  for  images  taken  from  the  training  data  set,  the 
validation  process  did  not  appear  to  provide  optimal  results  for  images  not 
previously  seen.  Thus,  we  recommend  that  additional  trials  and  analysis  be 
performed  to  better  determine  the  feasibility  of  using  the  CNN  to  augment  our 
approach,  as  described  above.  We  anticipate  that  mission  driven  scene 
understanding will lead to 1) improved autonomous intelligent systems supporting
Army  missions  in  complex  and  changing  environments  and  2)  improved  course  of 
action  strategies  based  on  scene  understanding  incorporating  battlefield  dynamic 
environments  changing  in  space  and  time. 